Serial No: 09/533,049
Filed: March 22, 2000
Examiner: A. A. Boutah
Title: SYSTEM AND METHOD FOR RECORDING A PRESENTATION FOR ON-DEMAND VIEWING OVER A
COMPUTER NETWORK



REPLY BRIEF 



Bellevue, Washington 98004 
December 5, 2005



TO THE DIRECTOR OF THE PATENT AND TRADEMARK OFFICE: 

This document is a Reply Brief in an appeal of a final rejection of the above-identified patent 
application and is responsive to the Examiner's Answer dated November 02, 2005. The Board is 
requested to consider the following remarks in reaching a decision in this appeal. 



The Combination of Dyson, Klemets and Gomez Fails to Teach or Suggest Automatic Time 
Indexing When Live Content is Captured or Data Stream is Produced. 

The combination of Dyson, Klemets and Gomez does not teach or suggest automatic time
indexing when live content is captured or when the data stream is produced. With respect to the 
Gomez reference, appellants respectfully disagree with the Examiner's response on page 22 of the 
Examiner's Answer that Gomez discloses that automatic time indexing is performed when live 
content is captured or the data stream is produced. 

In the Examiner's Answer, the Examiner asserts that Gomez discloses a full multimedia
production, such as a seminar, conference, or lecture, that can be captured in real time and is
handled automatically in the background, and is thus shielded from the user. The portions of Gomez
that the Examiner has cited in support of her assertion are underlined below and are identified for
reference hereinafter as a first, second, third, and fourth citation.

A full multimedia production such as a seminar, conference, lecture, etc. can be 
captured in real time using multiple cameras. A live movie of a speaker together with 
the speaker's flipping still images or slide show can be viewed interactively within the 
same video display screen. The complete production can be stored on a hard drive for 
retrieval on demand, or sent live to a host server for live distribution throughout a data 
network. It is also possible to store the complete presentation on portable storage
media and/or to send the complete presentation as an e-mail. (Gomez, Abstract, 
referred to hereinafter as the first citation.) 

According to the principles of the invention, the tools are handled automatically in the 
background, shielded from the user, and the encoding is done in real-time. The 
synchronization points are set when the event is really happening. In one example, 
overhead-projector plastic slides, computer VGA-graphics, whiteboard drawings, etc. 
are captured and converted to JPEG, and the video encoding is done in MPEG and 
stored together with the sound and synchronization points in an ASF file for RTSP 
(Real Time Streaming Protocol; see RFC 2326 published by IETF (www.IETF.org)) 
streaming. (Emphasis added, Gomez, column 2, lines 25-35, referred to hereinafter as 
the second citation.) 

As shown in FIG. 1, an exemplary system according to principles of the invention for 
automated conversion of a visual presentation into digital data format includes video 
cameras 11 and 13, a microphone 12, an optional lap top computer 10, and a digital 
field producer unit 14, also referred to herein as DFP unit or DFP computer. One of 
the video cameras 13 covers the speaker and provides video information to the live 
video section 1, and the other video camera 11 covers the slide show, flip chart, white 
board, etc. and provides the corresponding video information to the still video section 
3. The microphone provides the audio to the sound section 2. In the example DFP unit 
of FIG. 1, the live video is encoded 4 (e.g., in MPEG) in real time during the speaker's 
visual presentation, and the still video of the slide show etc. is converted 5 into JPEG 
files in real time during the presentation. (Gomez, column 3, lines 25-40, referred to 
hereinafter as the third citation.) 

A synchronizing section 16 of FIG. 1 operates automatically during the speaker's 
presentation to synchronize the still video information from the slide show, flip chart 
etc. with the live video information from the speaker. Both the live video and the still 
video can then be streamed live through a server 15 to multiple individual users via a 
data network 18 such as, for example, the Internet, a LAN, or a data network including 
a wireless link. (Gomez, column 3, lines 41-49, referred to hereinafter as the fourth 
citation.) 

The Examiner's second, third, and fourth citations from Gomez respectively teach that: (1) 
the tools are handled automatically in the background; (2) there is an automated conversion of a 
visual presentation into a digital data format; and, (3) the synchronizing section operates 
automatically during the speaker's presentation. However, Gomez's synchronization, which is 
disclosed as operating automatically, is not equivalent to appellants' claim recitation of automatically 
time indexing when live content is captured or when the data stream is produced. 

Time indexing is a function that enables synchronization as indicated in appellants'
subparagraphs (c) of Claim 1, (d)(ii) of Claim 16, (d)(i) of Claim 20, and (b) of Claim 24. Similarly,
time indexing is a function that enables the associated presentation slide to be displayed at
substantially an identical time relative to when the slide was displayed during the live portion, as
recited in subparagraph (g) of Claim 9. As disclosed in regard to an exemplary preferred
implementation in appellants' specification:

In a preferred implementation of the present invention, each script command will be 
indexed so as to be synchronized with the nearest prior keyframe during a post-processing 
operation on the saved ASF stream file. This indexing of the script commands is performed 
as follows. First, a time index with a one-second granularity or resolution is encoded into 
the data stream, when the data stream is originally produced. Next, the keyframes are 
assigned a time index value based on their respective time stamps. Finally, each script 
command is indexed to the nearest prior keyframe time index value based on the script 
command's inherent time stamp location in the ASF stream. As a result of applying these 
steps to the foregoing exemplary events, the script command corresponding to slide 1 will 
be indexed to a time index value of 8 seconds since the nearest prior keyframe time index 
value corresponds to an 8 second time index value. Similarly, the script command 
corresponding to slide 2 will be indexed to a time index value of 24 seconds (its nearest 
prior keyframe time index value), and the script command corresponding to slide 3 will be 
indexed to a time index value of 58 seconds (its nearest prior keyframe time index value) 
(see appellants' specification, page 42, lines 19-33). 



As should be evident from this description, time indexing is not equivalent to synchronization, 
but instead, is a technique recited by appellants' claims, which enables synchronization to be 
achieved. 
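
For illustration only, the indexing steps quoted above from appellants' specification can be
sketched as follows. This is a minimal sketch and not the implementation disclosed in the
specification. The function name, the keyframe time stamps, and the script command time stamps are
hypothetical values chosen solely so that the results match the 8, 24, and 58 second time index
values recited in the quoted passage; the "foregoing exemplary events" themselves are not
reproduced in this brief.

```python
from bisect import bisect_right


def index_to_prior_keyframe(keyframe_times, command_times):
    """Map each script command to the time index of its nearest prior keyframe.

    keyframe_times and command_times are time stamps in seconds.
    The returned time index values use one-second granularity.
    """
    ordered = sorted(keyframe_times)
    indexed = []
    for stamp in command_times:
        # Position of the first keyframe that occurs after the command.
        pos = bisect_right(ordered, stamp)
        if pos == 0:
            # No keyframe precedes this command (not addressed in the quoted text).
            indexed.append(None)
        else:
            # Truncate to whole seconds to reflect the one-second resolution.
            indexed.append(int(ordered[pos - 1]))
    return indexed


# Hypothetical time stamps, assumed for illustration only.
keyframes = [0.0, 8.0, 24.0, 40.0, 58.0, 70.0]
slide_commands = [11.3, 30.5, 61.2]  # slides 1, 2, and 3

indexed = index_to_prior_keyframe(keyframes, slide_commands)
```

Under these assumed values, the three script commands receive time index values of 8, 24, and 58
seconds. The time index is what each script command is keyed to; synchronization is the result
that this keying makes possible.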

Furthermore, the Examiner asserts that the prior art teaches that a live, multimedia production 
can be encoded and assembled into a document file, such as an ASF file. The portion of Gomez that 
the Examiner has cited in support of her assertion is underlined below and is referred to hereinafter as 
the fifth citation: 

The DFP application section 19 further includes an encoder and streamer 
module 27 which receives the digital video output from video grabber card 20, and 
continuously encodes and compresses this data into a digitally transferrable stream 
with low bandwidth. The corresponding audio information from the audio input 
section is also encoded and compressed into the digitally transferrable stream. The 
encoding process is also conventionally referred to as streaming or streaming video. 
Encoding modules such as shown at 27 are conventionally known in the art. One 
example is the NetShow encoder and streamer conventionally available from 
Microsoft. In one example, the video encoding can be done in MPEG. The encoder 
and streamer module 27 can assemble the encoded video data in a document file, for
example an ASF file. (Emphasis added, Gomez, column 4, line 65-column 5, line 12, 
referred to hereinafter as the fifth citation.) 

In this case, the Examiner asserts that the assembling of the ASF is interpreted as "time
indexing" because it includes frames and corresponding timestamps which allow a user to control
passage of time during encoding. The Examiner cites column 5, lines 25-28 and lines 50 to 61. The
Examiner also cites column 7, lines 14-35 and these portions of Gomez are reproduced below and
respectively referenced hereinafter as the sixth, seventh, eighth, and ninth citations.

The ASF file can be output from the encoder and streamer module 27 for live 
streaming out of the DFP unit 14, and also for storage at 171 in the storage unit 17. 
The encoder and streamer module 27 also encodes the digitized audio signal received 
from the audio input section. The encoded video information is also output from the 
encoder and streamer module 27 to a streamed image display portion 260 of the GUI 
30, whereby the streaming video can be displayed on the monitor 34 via the video card 
24. The encoder and streamer module 27 receives a control input from an encoder 
control portion 36 of the GUI 30. The encoder control portion 36 permits a user, via 
the user command input and serial card 31, to control starting and stopping of the 
encoding process. In addition, the encoder control 36 provides a recording counter 
which tracks the passage of time during the encoding of the video event. (Emphasis 
added, Gomez, column 5, lines 13-28, referred to hereinafter as the sixth citation.)

The HTML file name, hhmmss.htm, is then sent as a relative URL (Uniform Resource 
Locator) from generator 26 to the encoder and streamer 27 for inclusion, at time stamp 
hhmmss, in the encoded streaming video data (e.g., in an ASF file) output by the
encoder and streamer 27. This synchronizes the still video information from the slide 
show with the "live" video information from the speaker. In addition, other files can 
be synchronized to the "live" video, such as sound, VRML, JAVA script, text files, 
voice-to-text files and files containing translations of voice-to-text files into other 
languages. (Gomez, column 5, lines 50-61, referred to hereinafter as the seventh 
citation.) 

The web browser 40 preferably includes an ASF player, executing as a plug-in or an 
ActiveX control, that processes the ASF file and presents the audio/video to the 
viewer. When the player, for example a conventional multimedia player such as 
Microsoft Windows Media Player, encounters a Script Command Object in the ASF 
file, it interprets and executes Script Command Object. When the player identifies the 
Script Command Object as a URL, it passes the URL to the browser. The browser 
executes the URL as if it had been embedded inside an HTML document. According 
to one embodiment, the URL points to HTML document hhmmss.htm, which in turn 
contains a pointer to the corresponding JPEG document hhmmss.jpg. (Gomez,
column 7, lines 14-26, referred to hereinafter as the eighth citation.) 

If the Windows Media Player control is embedded in an HTML file that uses frames, 
the URL can be launched in a frame that is also specified by the Script Command 
Object. This allows the Windows Media Player control to continue rendering the 
multimedia stream in one frame, while the browser renders still images or Web pages 
in another frame. If the Script Command Object does not specify a frame, then the 
URL can be launched in a default frame. (Gomez, column 7, lines 27-34, referred to 
hereinafter as the ninth citation.) 

In the seventh citation, Gomez discloses a file name - but not a frame - that is sent for 
inclusion at time stamp hhmmss, in the encoded streaming video data output by the encoder and
streamer. Furthermore, the time stamp corresponds to the system time on which the HTML file name 
was based (Gomez, column 6, lines 24-25). Appellants do not perceive where Gomez teaches or 
suggests that time stamps correspond to or are equivalent to frames. 

In the sixth citation, Gomez discloses that a recording counter tracks the passage of time and 
is provided by the encoder control. However, Gomez does not expressly teach or suggest that time 
stamps are employed by the recording counter to keep track of time. This citation also discloses that 
a user can start and stop the encoding process. Appellants respectfully disagree that Gomez teaches
frames and corresponding time stamps that enable a user to control passage of time during the
encoding. A user may start and stop the encoding process in Gomez, but the manner
in which this happens is not disclosed in Gomez and it is inappropriate to assume that time stamps are 
employed for this purpose. 

In the ninth citation, frames are disclosed, but the term "frame" appears to be used
differently in Gomez than in appellants' claims. Gomez's frame appears to be a feature that divides a
browser's display into separate windows that can be scrolled independently of each other such that a 
multimedia stream can be rendered in one frame (or window) while Web pages are rendered in 
another frame (or window). In contrast, appellants' claims refer to a keyframe and a deltaframe, 
which are not simply display windows appearing in a browser application. 

With respect to the Dyson reference, the Examiner has also asserted that Dyson discloses time 
indexing in "Using the ASF Editor" that enables a user to place audio and video files into the timeline 
and refers to "Using NetShow Live Administrator" as enabling users to record live audio. However, 
enabling a user to manually place audio and video files into a timeline (i.e., "you can place event on a 
timeline") teaches a manual process and not an automated process. Furthermore, Dyson simply states
that the ASF Editor is used to combine and synchronize images, audio and script commands. As
described above, appellants' claim recitation of "time indexing" is different from
"synchronization."

With respect to the Examiner's reliance on the Klemets reference, appellants continue to rely
on the traverse presented by their arguments on pages 7-9 in their Appeal Brief. 

Therefore, for the reasons noted above, the combined references of Dyson, Klemets, and 
Gomez fail to teach or suggest automatic time indexing when live content is captured or a data stream 
is produced. 

The Combination of Dyson, Klemets and Gomez Fails to Teach or Suggest Keyframes and 
Deltaframes. 

Appellants respectfully disagree with the Examiner's statements on page 22 of the Examiner's
Answer, asserting that Gomez discloses keyframes and deltaframes, in column 8, line 49-column 9,
line 6. The portion of Gomez cited by the Examiner is reproduced below:

The still image grab/convert portion 610 provides the pixel data received from 
the video grabber card 23 to a data storage section at 650 and 660. Each time a still 
image is grabbed, the pixel data is provided to a current picture storage section 650 
whose previous contents are then loaded into a last picture storage section 660. In this 
manner, the pixel data associated with the current still image and the most recently 
grabbed previous still image (i.e., the last still image) are respectively stored in the 
data storage sections 650 and 660. A difference determiner receives the current and 
last picture data from the storage sections 650 and 660, and determines a difference 
measure, if any, between the current still image and the last still image. If the 
difference determiner determines that a difference exists between the two still images, 
then information indicative of this difference is provided to a threshold portion 640, 
which compares the difference to a threshold value to determine whether the images 
differ enough to warrant creation of a new JPEG file corresponding to the current 
image. If the difference information received from difference determiner 630 exceeds 
the threshold of threshold portion 640, then the output 690 of threshold portion 640 is 
activated, whereby the create file signal 680 is activated by operation of an OR 
gate 685 that receives the threshold output 690 as an input. The OR gate 685 also 
receives as an input the manual/automatic signal from FIG. 5, whereby the file 
creator 620 can be directed to create a JPEG file either by activation of the threshold 
portion output 690 or by a "manual" indication from the manual/automatic signal. 
(Gomez, column 8, line 49-column 9, line 6.) 
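
For illustration only, the file-creation logic described in the passage quoted above can be
sketched as follows. This is a minimal sketch under stated assumptions and is not Gomez's
implementation. Gomez does not specify how the difference measure is computed, so a mean absolute
pixel difference is assumed here; the function names, pixel values, and threshold are likewise
hypothetical. The handling of the first grabbed image, which has no prior image for comparison, is
also an assumption.

```python
def difference_measure(current, last):
    """Assumed metric: mean absolute difference between two equally sized pixel lists."""
    return sum(abs(c - l) for c, l in zip(current, last)) / len(current)


def should_create_file(current, last, threshold, manual=False):
    """Model of the OR gate: create a file if the threshold is exceeded or on manual request."""
    exceeds_threshold = difference_measure(current, last) > threshold
    return exceeds_threshold or manual


def files_created(grabs, threshold):
    """Return the positions of grabbed still images for which a new file would be created."""
    created = []
    last = None  # models the "last picture" storage section
    for position, current in enumerate(grabs):
        # Assumption: the first grab has nothing to be compared against.
        if last is not None and should_create_file(current, last, threshold):
            created.append(position)
        last = current  # the current picture becomes the last picture
    return created


# Hypothetical grabbed still images, each reduced to four pixel values.
grabs = [
    [10, 10, 10, 10],
    [10, 11, 10, 10],      # camera noise only
    [200, 200, 200, 200],  # a different slide
    [200, 200, 201, 200],  # camera noise only
]
created = files_created(grabs, threshold=5.0)
```

Under these assumed values, a new file is created only for the third grabbed image. In this
sketch the comparison serves only to decide whether a complete still image file is written; the
difference data itself is not stored or streamed.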

The Examiner further asserts that Gomez discloses a difference determiner that receives the
current and last picture data and determines a difference measure between the current still image and
the last still image, and that, if there is enough difference between the current and last picture
data, a new JPEG file is produced. The Examiner indicates that appellants define keyframes as video
frames that comprise new data, while deltaframes comprise data corresponding to the difference
between the current frame and its immediately preceding frame. The Examiner thus concludes that
Gomez's new JPEG file is interpreted as a "keyframe," while the data corresponding to the difference
between the current and last images are interpreted as corresponding to "deltaframes."

Appellants respectfully disagree with this assertion. Appellants' step (b) of Claim 1 recites 
(with emphasis added), "automatically embedding the slide display commands into a data stream as 
the data stream is produced, the data stream comprising data corresponding to the live portion of the 
presentation, wherein the live content is captured as a plurality of video frames comprising a 
plurality of keyframes and deltaframes." Subparagraph (a)(i) of Claim 9 includes a similar recitation.
Therefore, it is evident from the claim language that the plurality of keyframes and deltaframes are 
included in the video frames captured as live content. 

As is apparent from the Examiner's first citation (supra) to the Abstract of Gomez, Gomez 
discloses that a live movie together with the speaker's flipping still images or slide show can be 
viewed. The live movie is produced in response to the video signal from live video camera 13 by 
encoder and streamer 27 (see FIGURE 2 of Gomez). In the third citation, Gomez states that the
live video is encoded in real time during the speaker's visual presentation, and the still video of the 
slide show is converted into JPEG files in real time. Furthermore, Gomez discloses that during 
replay broadcasts, the web server retrieves and forwards the stored ASF file (containing the 
encoded/compressed "live" video data), accesses the stored HTML documents, and retrieves and 
forwards the stored JPEG documents (Gomez, column 7, lines 50-54). However, the stored JPEG 
files are produced by still image grab/converter 21. Gomez indicates that either the user can 
manually initiate the production of each new JPEG file of a still slide or that the process can be 
automated in response to the difference between successive still image frames of the slides (where the 
difference indicates that the slides have changed during the live presentation). Gomez only produces 
JPEG files of the slides and there is no teaching that these JPEG files include keyframes or 
deltaframes, which they would not, since they are simply static JPEG files. Thus, the JPEG files 
utilized by Gomez are not equivalent to keyframes and deltaframes that are included in the video
frames that are captured as live content, as recited in appellants' claims. 

With respect to Dyson, the Examiner asserts that support of time indexing of video frames can 
be found in the teachings of Dyson (Using the ASF Editor: allows the user to place audio and video 
files into the timeline). However, Dyson does not teach or suggest any video frames that include 
keyframes and deltaframes. 

Therefore, it is clear for the reasons given above that the combined references of Dyson and 
Gomez fail to teach or suggest keyframes and deltaframes as recited by appellants' claims. 

The Combination of Dyson and Gomez Fails to Teach or Suggest Generation of Slide Display
Commands in Response to Slide Triggering Events.

The combination of Dyson and Gomez fails to teach or suggest generation of slide display
commands in response to slide triggering events. With respect to Dyson, appellants respectfully 
disagree with the Examiner's statement on page 23 of the Examiner's Answer wherein the Examiner 
asserts that Dyson discloses (in Creating Netshow Content) allowing the user to embed scripting 
commands into an .asf file so that one can use it to open web pages and send script commands to 
clients, open URLs, and manage input and feedback from users. 

The Examiner also asserts that Gomez teaches a user's command input, allowing the user to 
flip through still images or slideshows. The Examiner cites the abstract and column 3, lines 33 
through 47, and column 4, lines 60-64 of Gomez. It apparently is the Examiner's position that a 
user's command input to flip images corresponds to a "slide triggering event," and in response to the 
user's command input, the Examiner indicates that Gomez discloses inserting a script command 
object into an ASF file to control the display of images. The Examiner cites column 6, lines 1-4; 
column 7, lines 18 to 30; and column 8, lines 1-5 in support of her assertion. The Examiner indicates 
that the script command is interpreted as a "slide display command." 

It appears that the Examiner has misconstrued the teachings of Gomez. As noted above, 
Gomez teaches that a user can manually cause still image grabber and converter 21 to grab a still 
image of a slide or a flip chart. The control of the still image grabber and converter with a user input 
is not equivalent to a presenter issuing a slide change command to change the slide being shown 
during a live presentation (which is the slide triggering event referenced in the appellants' claims). 
The script command object disclosed in Gomez is used to insert a URL into the ASF file, as clearly 
stated in Gomez at column 5, line 65 through column 6, line 4. Gomez does not teach or suggest that
a script command object is inserted into the ASF file in response to a slide display command input by 
a user to change slides during a live performance. 

With respect to Dyson, any command referred to therein is not related to the display of slides 
and is not produced in response to a slide triggering event. Dyson's commands appear to open Web 
pages and URLs or manage input and feedback, etc. Dyson's commands do not appear to be related 
in any way to the display of slides. Furthermore, Dyson does not teach or suggest that the commands 
correspond to any specific event, such as a slide triggering event. Dyson appears to disclose 
embedding scripting commands at a user's discretion. 

Therefore, for the reasons noted above, the combined references of Dyson and Gomez fail to 
teach or suggest generation of slide display commands in response to slide triggering events. 

The Combination of Dyson and Gomez Fails to Teach or Suggest Controlling the Display of
Slides during Playback.

Appellants respectfully disagree with the Examiner's assertion on page 24 of the Examiner's
Answer that Dyson and Gomez teach controlling the display of slides during playback. The
Examiner asserts that Dyson (in Overview) teaches that users can fast forward quickly and easily to a 
specific point of interest and cites the last two lines of this portion of the reference. In addition, the 
Examiner asserts that Gomez discloses PowerPoint slides being converted into JPEG (Gomez, 
column 1, lines 51-54). Thus, during replay, the Examiner asserts, a user can navigate to any point of 
the presentation (Gomez, column 2, lines 13-24; column 6, line 61 to column 7, line 8).

However, Dyson discloses: 

You will also occasionally see the abbreviation ASF used for Active Movie 
Streaming Format or Active Streaming Format rather than ActiveX Streaming Format; 
the initials and the format are the same, and that is what really matters. Active 
Streaming Format (ASF) allows you to deliver multimedia content on your corporate 
intranet or to the Internet. With ASF files, you can provide audio, illustrated audio, or 
video at various rates. You can also open Web pages, add scripting commands, and 
even add markers so that your users can fast forward quickly and easily to a specific 
point of interest. (Dyson, Chapter 8: Adding Audio and Video with Netshow.) 

Dyson does not teach or suggest controlling the display of slides in the above citation. Also, 
the "navigation" or fast forwarding referred to in Dyson is not the same as appellants' "controlling 
display," since there is no suggestion in Dyson that the display of slides be controlled. Fast 
forwarding to a specific point in a streaming format, as disclosed in Dyson, is not the same action as
controlling the display of slides to replicate their display as it occurred in the live presentation, during 
playback. 

Furthermore, Gomez does not teach or suggest that an "actual" slide presented during the live 
presentation is even available during the playback. A PowerPoint slide that is converted into a JPEG 
file is not an actual "slide" that is shown during the live presentation and also during the playback. 
Therefore, Gomez does not teach or suggest appellants' claim recitation of "controlling display of 
slides during playback." 

From the preceding discussion, it will be apparent that the combination of Dyson, Klemets
and Gomez does not teach or suggest appellants' claim recitation of automatic time indexing when
live content is captured or a data stream is produced, or teach or suggest appellants' claim
recitation of a plurality of keyframes and deltaframes comprising a data stream. In addition, the
combination of Dyson and Gomez does not teach or suggest the generation of slide display commands
that are in response to slide triggering events and are included in the data stream, or teach or suggest
controlling display of slides during playback of a presentation.

Accordingly, the Examiner's position in rejecting the claims on appeal is without merit and 
appellants again ask that the Board overrule the Examiner's rejection of these claims and instruct the 
Examiner to pass the application to issue without delay. 